The assumptions of the linear regression model
Abstract
The paper is prompted by certain apparent deficiencies both in the discussion of the regression model in instructional sources for geographers and in the actual empirical application of the model by geographical writers. In the first part of the paper the assumptions of the two regression models, the ‘fixed X’ and the ‘random X’, are outlined in detail, and the relative importance of each of the assumptions for the variety of purposes for which regression analysis may be employed is indicated. Where any of the critical assumptions of the model are seriously violated, variations on the basic model must be used and these are reviewed in the second half of the paper.
The rapid increase in the employment of mathematical models in planning has led R. J. Colenutt to discuss ‘some of the problems and errors encountered in building linear models for prediction’. Colenutt rightly points out that the mathematical framework selected for such models ‘places severe demands on the model builder because it is associated with a highly restrictive set of assumptions . . . and it is therefore imperative that, if simple linear models are to be used in planning, their limitations should be clearly understood’. These models have also been widely used in geography, for descriptive and inferential purposes as well as for prediction, and there is abundant evidence that, like their colleagues in planning, many geographers, when employing these models, have not ensured that their data satisfied the appropriate assumptions. Thus many researchers appear to have employed linear models either without verifying a sufficient number of assumptions or else after performing tests which are irrelevant because they relate to one or more assumptions not required by the model. Furthermore, many writers reporting geographical research have completely omitted to indicate whether any of the assumptions have been satisfied.
This last group is ambiguous, and it is clearly not possible, unless the values of the variables are published, to judge whether the correct set of assumptions has been tested or, indeed, to ascertain whether any such testing has been performed at all. This problem partially arises from certain shortcomings in material which has been published with the specific objective, at least inter alia, of instructing geographers on the use of quantitative techniques. All of these sources make either incomplete or inaccurate specifications of the assumptions underlying the application of linear models, although it is encouraging to note that there has been a considerable improvement in the quality of this literature in recent years. Thus, there were four books and two articles published in the early and mid-1960s which may be classified as belonging to this body of literature,³ yet, in five of these six sources, only one of the assumptions of the model is mentioned and, even
Publication date: 1998